tests: Add the cargo-fuzz fuzzing suite #36982
Draft
def- wants to merge 2 commits into
This was referenced Jun 12, 2026
def-
added a commit
that referenced
this pull request
Jun 15, 2026
Hardens the `mz-avro` decoder against adversarial input (Avro bytes/schemas arrive from Kafka and an external registry, so a panic/OOM is an availability bug): bound per-block array/map lengths and object counts by remaining input, cap object-container block byte length, bound schema-parse/value-decode recursion, and fix two schema-resolution panics on unmatched named types. Found by the cargo-fuzz suite ([separate infra PR](#36982)). Each fix has a regression test. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
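The "bound lengths by remaining input" idea in the commit above can be sketched as follows. This is a minimal illustration, not the `mz-avro` code: `DecodeError`, `checked_block_len`, and the `min_item_bytes` parameter are hypothetical names, and the sketch assumes every encoded item occupies at least `min_item_bytes` bytes.

```rust
/// Hypothetical error type for this sketch; not the `mz-avro` error type.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    LengthExceedsInput { claimed: usize, remaining: usize },
}

/// Validates a block's claimed item count before anything is allocated.
///
/// Assumption: each item takes at least `min_item_bytes` bytes on the wire,
/// so a block can never hold more than `remaining / min_item_bytes` items.
pub fn checked_block_len(
    claimed: usize,
    remaining: usize,
    min_item_bytes: usize,
) -> Result<usize, DecodeError> {
    // Treat a minimum of 0 as 1 to avoid dividing by zero.
    let max_items = remaining / min_item_bytes.max(1);
    if claimed > max_items {
        Err(DecodeError::LengthExceedsInput { claimed, remaining })
    } else {
        Ok(claimed)
    }
}
```

With a check of this shape, a length prefix claiming a million items in a 16-byte input is rejected as a decode error instead of driving a huge allocation.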
def-
added a commit
that referenced
this pull request
Jun 16, 2026
Print→reparse round-trip bugs in the SQL parser and pretty-printer, surfaced by the grammar-aware fuzz target. Each fix has a regression test; the sqllogictest/testdrive plan goldens are refreshed to match. Themes: quoting bare keyword identifiers (any/all/some/list, context-sensitive keywords), parenthesizing low-precedence operands (prefix ops, casts, COLLATE, quantified comparisons), special-form display correctness (EXTRACT/POSITION/SUBSCRIBE), and bounding parser recursion/backtracking to reject pathological inputs. Found by the cargo-fuzz suite ([separate infra PR](#36982)). 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
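The print→reparse property that the commit above refers to can be sketched generically. This is an illustrative helper, not the fuzz target from the PR: `check_roundtrip` and its closure parameters are hypothetical, and the real target works on the SQL AST rather than on arbitrary `T`.

```rust
/// Checks that printing a parsed value and parsing it again yields the same value.
///
/// Inputs that fail the first parse are accepted: rejecting input is not a
/// round-trip bug. Only a printed form that fails to reparse, or reparses to
/// a different value, is reported.
pub fn check_roundtrip<T: PartialEq>(
    input: &str,
    parse: impl Fn(&str) -> Option<T>,
    print: impl Fn(&T) -> String,
) -> Result<(), String> {
    let Some(first) = parse(input) else {
        return Ok(());
    };
    let printed = print(&first);
    match parse(&printed) {
        Some(second) if second == first => Ok(()),
        Some(_) => Err(format!("reparse of {printed:?} produced a different value")),
        None => Err(format!("printed form {printed:?} failed to reparse")),
    }
}
```

A printer that emits an unquoted keyword identifier or drops needed parentheses would show up here as one of the two `Err` cases.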
7f0a2e2 to
4380619
Compare
def-
added a commit
that referenced
this pull request
Jun 22, 2026
…37177) `RowPacker::push_array_with_unchecked` and `push_array_with_row_major` compute an array's expected cardinality as the product of its dimension lengths and compare it against the actual number of elements pushed. The product was an unchecked `usize` multiply (`dims.iter()...product()` / `cardinality *= dim.length`), so dimension lengths whose product exceeds `usize::MAX` overflowed. Under overflow checks (debug, and the cargo-fuzz build) this panics; in release it silently wraps, and a wrapped value can even spuriously match the actual element count — accepting a corrupt array (e.g. dims claiming `[2^32, 2^32]` wrap to a cardinality of 0, matching an empty element list). This is reachable from `Row::decode` over an attacker- or corruption-supplied `ProtoRow`, since the proto array dimensions are not otherwise bounded. Saturate the product to `usize::MAX` instead. An overflowing cardinality is impossibly large — no array can hold that many elements — so it never equals the real element count and the existing check rejects it as `WrongCardinality`, turning the panic/silent-wrap into a clean error on both build profiles. Found by the `repr::row_codec_roundtrip` cargo-fuzz target in #36982 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
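The difference between the wrapping and saturating products described above can be shown with a small sketch. The function names are hypothetical and the dimensions are plain `usize` values rather than the `mz-repr` dimension type; only `usize::saturating_mul` and `usize::wrapping_mul` from the standard library are relied on.

```rust
/// Product of dimension lengths, saturating at `usize::MAX` on overflow.
pub fn saturating_cardinality(dims: &[usize]) -> usize {
    dims.iter().fold(1usize, |acc, &len| acc.saturating_mul(len))
}

/// The release-mode behavior of an unchecked multiply: silently wraps.
pub fn wrapping_cardinality(dims: &[usize]) -> usize {
    dims.iter().fold(1usize, |acc, &len| acc.wrapping_mul(len))
}

/// Returns a dimension length whose square overflows `usize`
/// (2^32 on a 64-bit target).
pub fn half_width_len() -> usize {
    1usize << (usize::BITS / 2)
}
```

For `dims = [half_width_len(), half_width_len()]`, the wrapping product is 0 and would match an empty element list, while the saturating product is `usize::MAX` and can never equal a real element count.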
def-
added a commit
that referenced
this pull request
Jun 22, 2026
All found via #36982 --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
def-
added a commit
that referenced
this pull request
Jun 23, 2026
Hardens `mz-repr` proto/Row decoding against malformed/untrusted bytes — these paths are reachable from persisted state and the wire, so a panic is an availability bug. Replaces panics/asserts with proper decode errors and validates ranges (Date, CheckedTimestamp, ProtoNumeric, ProtoRange, ProtoRelationDesc, ProtoRow dict ordering, uuid parsing, leap-second truncation, Avro decimal). Found by the cargo-fuzz suite ([separate infra PR](#36982)). Each fix has a regression test. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Moritz Hoffmann <antiguru@gmail.com>
def-
added a commit
that referenced
this pull request
Jun 25, 2026
…nput (#36985) Hardens decode paths in `persist`, `persist-client`, `persist-types`, `pgrepr`, and `postgres-util` against malformed/untrusted input: reject rollup-less/empty-frontier/hollow-only/invalid StateDiff state instead of panicking or debug-asserting, guard UUID parsing, fix i16 overflow in pgrepr numeric-scale binary decode, and propagate u16-conversion errors in the postgres table-desc protos. Found by the cargo-fuzz suite ([separate infra PR](#36982)). Each fix has a regression test. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
36e969a to
a53a715
Compare
Squashes every fuzzing-infrastructure change in this branch into one commit, separate from the individual bug fixes the fuzzing surfaced (one commit each):
* The cargo-fuzz crates under `src/*/fuzz` (targets, seed corpora, dictionaries, and `prepare-corpus.sh` scripts), covering the SQL parser / pretty-printer, repr (strconv, jsonb, Row codec/proto, arithmetic oracles), the expr optimizer transforms, Avro/Protobuf/CSV/pgwire/pgcopy decoders, pgrepr/pgtz, the upsert state machine, persist durable-state decode, and the proto round-trips across storage-types/persist/catalog/external table descs.
* The harness and runner wiring: `--profile fruitful`, `--jobs auto`, per-crate sharding, artifact-based crash detection, `.repro.txt` sidecars, a time-capped post-fuzz corpus minimize/upload, and the auto-generated `buf.yaml` fuzz-crate excludes.
* CI: move cargo-fuzz from nightly to release qualification (24h, 48-core).
* The production-side enablement the targets require: the `fuzzing` Cargo feature and the `#[doc(hidden)]` / `cfg`-gated re-exports that expose upsert, persist-client, and pgwire internals to the fuzz crates.
* The macOS build fix those exports necessitated: switching the affected storage `Stream::inspect` calls to `InspectCore::inspect_container`, which avoids the objc2-driven trait-solver overflow the `Inspect` bound triggers on macOS.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes: CLU-137
Fixes two user-visible bugs in how `reduce` treats an erroring operand of
a non-strict AND/OR:
* A materialized view or query whose `mz_now()` temporal filter sits under
an OR of ANDs, e.g.
WHERE (a AND ... AND mz_now() < t) OR (b AND ... AND mz_now() < t)
could panic the compute worker (or fail to plan) with "Unsupported
temporal predicate". The shared `mz_now() < t` conjunct was left buried
inside the OR instead of being factored out, so temporal-filter
extraction failed.
* A query that should short-circuit, such as `WHERE col AND (1/0 = 1)`,
could spuriously fail at runtime (e.g. "division by zero") even on rows
where `col` is false. AND/OR are non-strict: `false AND <error>` is
`false` and `true OR <error>` is `true`, so such rows must be filtered,
not errored.
Mechanism: the generic variadic fold in `reduce` replaced a call with any
operand's literal error unconditionally, which is wrong for a function that
is not strict in errors. Fold only for the strict variadics, excluding
AND/OR and ErrorIfNull (which can absorb an operand's error at runtime).
Keeping the erroring operand then reaches `undistribute_and_or`, which
recombines operands across AND/OR's short-circuit boundary. That only
preserves error semantics for operands common to every disjunct, so skip
undistribution otherwise. A shared temporal predicate (`mz_now() < t`,
whose cast can error) is common to every disjunct, so it is still factored
out and stays extractable, unlike the reverted MaterializeInc#37049's blunt
`could_error()` skip.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
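The non-strict semantics described in the commit message above can be sketched with a toy evaluator. This is not the `reduce` implementation: `Value`, `eval_and`, `eval_or`, and `strict_fold_error` are hypothetical names, and NULL handling is left out to keep the sketch to what the message states.

```rust
/// Toy operand value for this sketch: a boolean or a literal error.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Err(String),
}

/// Non-strict AND: any `false` operand absorbs errors in the others.
pub fn eval_and(operands: &[Value]) -> Value {
    if operands.iter().any(|v| *v == Value::Bool(false)) {
        return Value::Bool(false);
    }
    match operands.iter().find(|v| matches!(v, Value::Err(_))) {
        Some(err) => err.clone(),
        None => Value::Bool(true),
    }
}

/// Non-strict OR: any `true` operand absorbs errors in the others.
pub fn eval_or(operands: &[Value]) -> Value {
    if operands.iter().any(|v| *v == Value::Bool(true)) {
        return Value::Bool(true);
    }
    match operands.iter().find(|v| matches!(v, Value::Err(_))) {
        Some(err) => err.clone(),
        None => Value::Bool(false),
    }
}

/// Models the unconditional fold: the call becomes the first literal
/// error among its operands. Only valid for functions strict in errors.
pub fn strict_fold_error(operands: &[Value]) -> Option<Value> {
    operands.iter().find(|v| matches!(v, Value::Err(_))).cloned()
}
```

For the operands of `col AND (1/0 = 1)` on a row where `col` is false, `strict_fold_error` yields the division error, whereas `eval_and` yields `false`, so the row is filtered rather than errored.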
Adds the cargo-fuzz suite: fuzz targets, seed corpora, dictionaries, and the runner/CI integration (release-qualification, 24h). Covers the SQL parser/pretty-printer, repr (strconv, jsonb, Row codec/proto, arithmetic oracles), the expr optimizer transforms, the Avro/Protobuf/CSV/pgwire/pgcopy decoders, pgrepr/pgtz, the upsert state machine, persist durable-state decode, and proto round-trips across storage-types/persist/catalog/external table descs.
Also includes the production-side enablement the targets require (the `fuzzing`/`fuzz` Cargo features and `#[doc(hidden)]`/`cfg`-gated re-exports) and the macOS build fixes those exposures necessitated.

This is the infrastructure PR: mostly mechanical (generated corpora/dicts). The individual bugs it surfaced are split into separate per-subsystem PRs.
Depends on:
`INTO` identifiers so `COPY` relations round-trip #37128